Abstract

Matrix Code gives imperative programming a mathematical seman- 
tics and heuristic power comparable in quality to functional and 
logic programming. A program in matrix code is developed incre- 
mentally from a specification in pre/post-condition form. The com- 
putations of a code matrix are characterized by powers of the ma- 
trix when it is interpreted as a transformation in a space of vectors 
of logical conditions. Correctness of a code matrix is expressed in 
terms of a fixpoint of the transformation. 

Categories and Subject Descriptors D.1.4 [Programming Tech- 
niques]: Sequential Programming; D.2.4 [Software/Program Ver- 
ification]: Correctness Proofs; D.3.3 [Programming Languages]: 
Language Constructs and Features — Control Structures; F.3.3 
[Studies of Program Constructs]: Control Primitives 

General Terms program verification, programming methodology 

Keywords Floyd assertions, Hoare logic, verification-driven pro- 
gramming 

1. Introduction 

By imperative programming we will understand the writing of code 
in which the state of the computation is explicitly manipulated by 
assignments that change the value of a variable. As a programming 
paradigm, imperative programming should be compared, and con- 
trasted, with functional and logic programming. Compared to these 
latter paradigms, imperative programming is in an unsatisfactory 
state. At least as a first approximation, a definition in functional 
or logic programming is both a specification and is executable. In 
imperative programming proving that a function body meets its 
specification is such a challenge that it is not considered part of 
a programmer's task. Another difference, probably related, is that 
functional and logic programming have an elegant mathematical 
semantics in which the behaviour of a definition is characterized as 
a fixpoint of the transformation associated with the definition. 

This paper is a contribution to imperative programming in the 
form of a new language, called Matrix Code, in which programs 
take the form of a matrix with binary relations among states as 
entries. Matrix Code is distinguished by a development process 
that begins with a null code matrix, progresses with small, obvious 
steps, and ends with a matrix that is of a special form that is trivially 
translatable to a conventional language like Java or C. The result of
the translation has the same behaviour as the one determined by 
the mathematical semantics of the code matrix. Therefore the latter 
can be said to be executable. As every stage in the development 
process is partially correct with respect to the specification (the 
correctness of the initial null code matrix is very partial), Matrix
Code comes close to the ideal in which the code is itself a proof of 
partial correctness. 

Plan of the paper In Section 2 we give a small example of the
verification of imperative code by Hoare's method. We note certain
features that point in the direction of Matrix Code. In Section 4
we define this language and show the same example translated to
it. In Section 5 we explain how a code matrix is executed and we
define its set of computations. In Section 6 we use the fact that a
code matrix is not only an executable program but also a set of
verification conditions to characterize partial correctness in terms
of a fixpoint of the matrix. In Section 8 we solve the problem of
Section 2 in the systematic manner that is unique to Matrix Code.
The final two sections survey related work and draw conclusions.

2. Hoare's verification method 

As an introduction to the verification method for imperative
programming due to Hoare [10] we verify a Java version of the
prime-number generating program developed by Dijkstra in [6]. The
Java version of this program is shown in Figure 1.

We think of a computation as a sequence of computation states
each of which consists of a control state (a code location) and a
data state (a tuple of values of the variables)¹. According to Hoare's
method, conditions are attached to code locations. The conditions
assert that certain relations between program variables hold at the
code locations. When such a condition occurs in a loop, it is the
familiar invariant of that loop. In Figure 1 we have indicated by
the comments S, A, B, C, and H where these conditions have to be
placed. Figure 2 contains the corresponding conditions.

The verification of the function as a whole relies on the
verification of a number of implications defined in terms of
conditions and program elements such as tests and statements.
Consider Figure 1: because there is an execution path from A to B,
one has to show the truth of

{A && k<N} j=p[k-1]+2; n=0; {B},
which has as meaning: if A && k<N (the precondition) is true and
if

j=p[k-1]+2; n=0;
is executed, then B (the postcondition) is true. Because of the three
elements: precondition, postcondition, and the item in between, this
is called a Hoare triple. Figure 2 contains not only the conditions



1 In this paper we consider only code that executes in a single activation
record.



public static void primes(int[] p, int N) {
  // S
  int j, k, n;
  p[0] = 2; p[1] = 3; k = 2;
  // A
  while (k<N) {
    j = p[k-1]+2; n = 0;
    // B
    while (p[n]*p[n] <= j) {
      // C
      if (j%p[n+1] != 0) n++;
      else {j += 2; n = 0;}
    }
    p[k++] = j;
  }
  // H
}



Figure 1. A Java function for filling p[0..N-1] with the first N
primes. At the points indicated by the comments S, A, B, C, H 
we need conditions to allow verification by Hoare's method. The 
identifiers and the structure are the same as in Dijkstra's example 
relB(p,k,n,j) means that there is no prime
between p[k-1] and j, and that j is not divided
by any prime in p[0..n], and that n<k.



Hoare triples:

{S} p[0]=2; p[1]=3; k=2; {A}
{A && k >= N} {H}
{A && k < N} j=p[k-1]+2; n=0; {B}
{B && p[n]*p[n] <= j} {C}
{B && p[n]*p[n] > j} p[k++] = j {A}
{C && j%p[n+1] != 0} n++ {B}
{C && j%p[n+1] == 0} j += 2; n = 0 {B}



Figure 2. Conditions and Hoare triples for Figure 1. The meaning
of a Hoare triple {A0} CODE {A1} is that if condition A0 is true
and if CODE is executed with termination, then condition A1 is true.



for Figure 1 but also the set of verification conditions in the form
of Hoare triples. 
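
The conditions can also be checked while the function runs. The
following is a minimal sketch, not taken from the figures: it
instruments the function of Figure 1 with an executable reading of
condition A, taken here to mean that p[0..k-1] contains the first k
primes in increasing order. The class name CheckedPrimes, the helpers
isPrime and checkA, and the choice to return the array instead of
filling a parameter are assumptions made for illustration only.

```java
final class CheckedPrimes {
    // Naive primality test, used only by the run-time check.
    static boolean isPrime(int x) {
        if (x < 2) return false;
        for (int d = 2; d * d <= x; d++) if (x % d == 0) return false;
        return true;
    }

    // Executable reading of condition A: p[0..k-1] holds the first k primes.
    static void checkA(int[] p, int k) {
        int candidate = 2;
        for (int i = 0; i < k; i++) {
            while (!isPrime(candidate)) candidate++;
            if (p[i] != candidate)
                throw new IllegalStateException("A violated at index " + i);
            candidate++;
        }
    }

    // The function of Figure 1, with A checked where the comment A stands.
    static int[] primes(int N) {
        if (N < 2) throw new IllegalArgumentException("precondition S requires N > 1");
        int[] p = new int[N];
        int j, k, n;
        p[0] = 2; p[1] = 3; k = 2;
        while (true) {
            checkA(p, k);                    // code location A
            if (k >= N) break;               // {A && k >= N} {H}
            j = p[k - 1] + 2; n = 0;         // code location B follows
            while (p[n] * p[n] <= j) {
                if (j % p[n + 1] != 0) n++;  // code location C
                else { j += 2; n = 0; }
            }
            p[k++] = j;
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(primes(10)));
    }
}
```

Such a check tests the condition only on the runs that are actually
executed; it does not replace the proof obligations expressed by the
Hoare triples.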

The term "condition" for the type of thing that occurs as pre- 
condition and postcondition in a Hoare triple is, in our view, rather 
compelling. However, it seems that in certain contexts "assertion" 
is a more natural alternative term. In this paper we will use both. 
At the same time, one should make a distinction between the con- 
dition as a linguistic expression and the set that is the meaning of 
that expression. We trust no confusion arises as we use "assertion" 
and "condition" interchangeably for both the expression and the 
meaning. 

Figure 1 may seem to be the obvious, or even only, solution to
the problem. But instructors in a beginners' programming course
will see a wondrously creative variety of alternative solutions.
Being solutions, they can all be verified by the same set of triples
as in Figure 2. What all these solutions also have in common is the
flow chart, and this flow chart is also verified by the same set of
triples. In this sense the flow chart is a language-independent
notation for an algorithm that also accommodates Hoare's
verification method. In fact, the method originates with Floyd [8],
who introduced it with flow charts. In spite of their merit of
language-independence and verifiability we are not satisfied with
flow charts because of their lack of heuristic power and because
of the lack of an attractive mathematical model. Flow charts are
interesting because they are only a small step away from Matrix
Code, which does have these two properties. We describe this step
in the remainder of this section.

It would be tempting to say that, once we have a sufficient
set of Hoare triples, we can forget the program in Figure 1: all
information about it is in the Hoare triples of Figure 2. This may
seem so because, for example, in

{A && k < N} j=p[k-1]+2; n=0; {B}
A stands for the condition defined earlier in that figure. What is
missing is the fact that condition A is tied to code location A. We
need the preconditions to be identified by a single letter that stands
for a condition, so that all triples have the form {P}S{Q}.

We will show that an algorithm as a set of triples of the form
{P}S{Q} has an attractive mathematical model and has considerable
heuristic power. Assuming then that all the information about
an algorithm is in a set of items of the form {P}S{Q}, what is a
convenient format for such a set? The most obvious seems a graph
where the nodes represent conditions and where the directed edges
are labeled with the middle items of the triples. Such a graph is also
often used to represent a sparse matrix. A disadvantage of the
matrix format is that it takes up an amount of space that is quadratic
in the number of nodes. However, the mathematical model that we
propose for an algorithm as a set of triples of the form {P}S{Q} is
that of a transformation in a certain type of vector space of
conditions. We are used to having such transformations represented
by matrices rather than graphs. As in other uses of such
transformations, matrix multiplication is a familiar and fundamental
operation. For most people graph multiplication, though perfectly
well defined, is not familiar. Hence we opt for the matrix
representation and use Matrix Code as the name for an algorithm as a
set of triples of the form {P}S{Q}.

But we are running ahead of the story: this is only relevant if we 
can get all triples in the form {P}S{Q}. We do this by generalizing 
the S from a statement to a binary relation between data states. 
Because of the essential role of binary relations we review and 
introduce the needed terminology and notation. 



3. Preliminaries on binary relations 

As binary relations are essential to Matrix Code we review notation 
and terminology. For the purposes of this paper, a binary relation $R$
on a set $D$ is a subset of the Cartesian product $D \times D$. If
$(s_0, s_1)$ is in a binary relation, then we say that $s_0$ is an
input and $s_1$ an output of the relation.

The null relation is the empty subset of $D \times D$. The identity
relation $I_D$ on $D$ is $\{(s_0, s_1) \in D \times D : s_0 = s_1\}$.
The union $R_0 \cup R_1$ of binary relations $R_0$ and $R_1$ is
defined to be their union as subsets of $D \times D$. The composition
$R_0;R_1$ of binary relations $R_0$ and $R_1$ is
$\{(s_0, s_1) \in D \times D : \exists t \in D.\ (s_0, t) \in R_0 \wedge (t, s_1) \in R_1\}$.
The inverse $R^{-1}$ of a binary relation $R$ is
$\{(t, s) \in D \times D : (s, t) \in R\}$.

Let us call subsets of $D$ conditions, anticipating their future use.
The left projection of a binary relation $R$ is defined as the
condition $\{x \in D : \exists y \in D.\ (x, y) \in R\}$. Dually, the
right projection of a binary relation $R$ is defined as the condition
$\{y \in D : \exists x \in D.\ (x, y) \in R\}$.

We generalize $I_D$ to $I_c$, which means, for any condition
$c \subseteq D$, by definition, $\{(x, x) \in D \times D : x \in c\}$.
This induces a one-to-one relation between $c$ and $I_c$:

$x \in c \Leftrightarrow (x, x) \in I_c$.

Accordingly, at times we view a condition (alias assertion) as a
subset of $D$; at times as a subset of $I_D$.

DEFINITION 1. Given a condition $p \subseteq D$ and a binary relation
$R \subseteq D \times D$, we write {p}R for the right projection of
$I(p);R$, where $I(p)$ is the binary relation
$\{(x, x) \in D \times D : x \in p\}$.

Hoare triples were intended to be applied to program state- 
ments. However, they have a natural interpretation for binary re- 
lations, as follows. 

DEFINITION 2. The Hoare triple {p}R{q} holds iff
$\{p\}R \subseteq q$.
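
For a finite set $D$ these definitions can be executed directly. The
following sketch, under assumptions of our own, represents a relation
as a set of two-element lists over the integers; the class name
Relations, the method names, and the "add one" example relation are
hypothetical and serve only to illustrate Definitions 1 and 2.

```java
import java.util.*;

final class Relations {
    // A pair (x, y) is a two-element list, so equals and hashCode come for free.
    static List<Integer> pair(int x, int y) { return Arrays.asList(x, y); }

    // Composition R0;R1 = {(s0, s1) : exists t. (s0, t) in R0 and (t, s1) in R1}.
    static Set<List<Integer>> compose(Set<List<Integer>> r0, Set<List<Integer>> r1) {
        Set<List<Integer>> out = new HashSet<>();
        for (List<Integer> a : r0)
            for (List<Integer> b : r1)
                if (a.get(1).equals(b.get(0))) out.add(pair(a.get(0), b.get(1)));
        return out;
    }

    // I(p): the subset of the identity relation determined by condition p.
    static Set<List<Integer>> identityOn(Set<Integer> p) {
        Set<List<Integer>> out = new HashSet<>();
        for (int x : p) out.add(pair(x, x));
        return out;
    }

    // Right projection: the set of all outputs of the relation.
    static Set<Integer> rightProjection(Set<List<Integer>> r) {
        Set<Integer> out = new HashSet<>();
        for (List<Integer> a : r) out.add(a.get(1));
        return out;
    }

    // {p}R of Definition 1: the right projection of I(p);R.
    static Set<Integer> post(Set<Integer> p, Set<List<Integer>> r) {
        return rightProjection(compose(identityOn(p), r));
    }

    // Definition 2: {p}R{q} holds iff {p}R is a subset of q.
    static boolean holds(Set<Integer> p, Set<List<Integer>> r, Set<Integer> q) {
        return q.containsAll(post(p, r));
    }

    public static void main(String[] args) {
        // Hypothetical example: D = {0,...,4} and R is "add one" where defined.
        Set<List<Integer>> r = new HashSet<>();
        for (int x = 0; x < 4; x++) r.add(pair(x, x + 1));
        Set<Integer> p = new HashSet<>(Arrays.asList(1, 2));
        System.out.println(post(p, r));
        System.out.println(holds(p, r, new HashSet<>(Arrays.asList(2, 3, 4))));
        System.out.println(holds(p, r, new HashSet<>(Arrays.asList(2))));
    }
}
```

In this example {p}R is the set {2, 3}, so the triple holds for the
postcondition {2, 3, 4} and fails for the postcondition {2}.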

DEFINITION 3. A trace of a relation $R \subseteq D \times D$ is a
possibly infinite sequence of elements of $D$ such that for any pair
$(s, s') \in D \times D$ such that $s'$ follows $s$ in the sequence we
have that $(s, s') \in R$.

A trace $s_0 \cdots s_{n-1}$ is closed iff there is no $d \in D$ such
that $(s_{n-1}, d) \in R$.

A segment $[\alpha, \omega]$ of a trace is a contiguous subsequence of
the trace; $\alpha \in D$ is the first, $\omega \in D$ is the last
element in the segment.

4. Matrix code 

As a first step toward matrix code we modify the nature of the 
middle term of the Hoare triple {P}T{Q}. Conventionally T is a 
statement of a conventional language, typically changing the value 
of one or more variables. 

Let us regard the collection of all variables accessible to the 
code as a tuple of the values indexed by the names of the variables. 
We call this tuple the data state. Thus, in Figure 1 the data state
consists of the array p and the variables k, n, j.

The effect of a statement can be modeled as a binary relation on
data states. Expressed in terms of sets, such a relation $R$ is the
set of pairs $(x, y)$ such that $R$'s output $y$ is a possible data
state after executing the statement when $R$'s input $x$ is the data
state before. This captures all terminating statements. In particular,
an assignment statement v = E corresponds to the binary relation
consisting of pairs $(x, y)$ of data states where the v component of
$y$ is equal to the result of evaluating E in $x$, and all other
components of $y$ are equal to the corresponding ones in $x$.

We may have that $R$ is not single-valued: there may exist $x$, $y_0$,
and $y_1$ such that $(x, y_0) \in R$ and $(x, y_1) \in R$ and
$y_0 \neq y_1$. That is, we admit nondeterministic statements, so $y_0$
is a possible output rather than the output. Modeling a statement as a
relation $R$ allows us to account for another computational
phenomenon: it may be that for some $x$ there is no $y$ such that
$(x, y) \in R$. This expresses the fact that for some data states as
input the effect of the statement is not defined. For example, if the
input data state $x$ is such that in this data state w = 0, then for
the relation modeling the statement u := u/w there is no corresponding
output $y$. But of course $R$ can be a function on the set of data
states so that it is defined for every state as input and that for
each of these there is one and only one output.

We modeled the middle term T, which is conventionally a state- 
ment, in {P}T{Q} as a binary relation. We now generalize T by 
allowing it to be any binary relation over data states. We call this 
generalization of T a transition. 

A transition may denote the empty relation. The identity relation 
on a set $A$ of data states is $\{(s, s) : s \in A\}$. A transition in the form
of a boolean expression b denotes a subset of the identity relation. 
Such a transition we call a guard, following Q]. 

Thus guards can be composed with other transitions. If $r_1$ and
$r_2$ are the meanings of transitions $t_1$ and $t_2$, then $r_1;r_2$
is the meaning of $t_1;t_2$, which is the transition consisting of the
execution of $t_1$ followed by the execution of $t_2$. But either or
both of $t_1$ and $t_2$ may be a guard, and then the effect of
$t_1;t_2$ is equally well determined by the definition of composition
of binary relations. The interpretation of transition elements allows
the composition of a guard with any transition, whether that is a
guard or not. For example, v--; v > 0 and v > 1; v-- are both
well-defined transitions.
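
The composition of guards and statements can be sketched in Java as
follows. This is an illustration under our own assumptions: the data
state is reduced to a single integer v, a transition maps an input to
the set of its possible outputs, and the names Transitions, guard,
statement, and seq are hypothetical. The two composed transitions in
main use the constants 0 and 1 as illustrative choices.

```java
import java.util.*;
import java.util.function.*;

final class Transitions {
    // A transition maps an input data state to the set of its possible
    // outputs. An empty set means it is not defined for that input.
    interface Transition { Set<Integer> apply(int v); }

    // A guard is a subset of the identity relation: it passes the state
    // through unchanged when the test holds and is undefined otherwise.
    static Transition guard(IntPredicate test) {
        return v -> test.test(v)
            ? Collections.<Integer>singleton(v)
            : Collections.<Integer>emptySet();
    }

    // A deterministic statement is single-valued and defined everywhere.
    static Transition statement(IntUnaryOperator effect) {
        return v -> Collections.<Integer>singleton(effect.applyAsInt(v));
    }

    // Composition t1;t2: run t1, then run t2 on every possible output of t1.
    static Transition seq(Transition t1, Transition t2) {
        return v -> {
            Set<Integer> out = new HashSet<>();
            for (int mid : t1.apply(v)) out.addAll(t2.apply(mid));
            return out;
        };
    }

    public static void main(String[] args) {
        Transition decThenTest = seq(statement(x -> x - 1), guard(x -> x > 0));
        Transition testThenDec = seq(guard(x -> x > 1), statement(x -> x - 1));
        for (int v = 0; v <= 3; v++)
            System.out.println(v + ": " + decThenTest.apply(v) + " " + testThenDec.apply(v));
    }
}
```

With these particular constants the two compositions denote the same
relation: both are defined exactly for inputs greater than 1 and both
yield the input decreased by one.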



Conditions : 

{S} p[0]=2; p[1]=3; k=2; {A}
{A} k >= N {H}
{A} k < N; j=p[k-1]+2; n=0; {B}
{B} p[n]*p[n] <= j {C}
{B} p[n]*p[n] > j; p[k++] = j {A}
{C} j%p[n+1] != 0; n++ {B}
{C} j%p[n+1] == 0; j += 2; n = 0 {B}



Figure 3. Hoare triples for Figure 1. The middle terms in the
verification conditions are transitions. 



The purpose of the introduction of transitions is that we can
write the verification conditions of Figure 2 as in Figure 3. We
introduce a new programming language so that Figure 3 is itself a
program and so that Figure 3 is also the verification of that program.

A natural notation for a set of items of the form {P}T{Q} is a
matrix with rows and columns labeled by conditions. In this way
the verification conditions of Figure 3 become the code matrix
in Figure 12, a program in Matrix Code, the language. As we
customarily do, the empty row of the start label and the empty
column of the halt label have been omitted.

DEFINITION 4. Given are a set $L$ of labels, a tuple of variables,
boolean expressions testing a relation among a subset of these
variables, and statements defined on a subset of these variables.
A code matrix consists of an $L$-by-$L$ matrix $M$, an $L$-indexed row
vector of conditions preceding the sequence of rows of $M$, and an
$L$-indexed column vector of conditions following the sequence of
columns of $M$. For all $i$ and $j$ in $L$ the element $M_{ij}$ of
column $i$ and row $j$² is a transition, an expression denoting a
binary relation. Among the labels there is one that labels an empty
row; this is the start label. Among the labels there is one that
labels an empty column; this is the halt label.



2 Note the transposition from the usual order. In this way the direction of 
execution is from i to j. 



A code matrix is a way of writing a set of verification condi- 
tions, so has the status of a formula of logic. Yet it is also a pro- 
gram for a suitably defined abstract machine. This will be proved 
by defining the computations of a code matrix. 

Matrix Code is a programming language that relies on an un- 
derlying base language in which to define the types of the variables 
and in which to write the statements and boolean expressions that 
make up the transitions. Matrix Code is defined informally, as done 
here. The fragments of base language that are needed are defined 
according to the standard of Java or C, as the case may be. 

Some conventions for writing a code matrix: if a cell contains 
the null relation as a transition, then nothing is written in the cell; 
if a row or column is empty, it is omitted. 

5. The computations of a code matrix 

Section [4] gave a syntax of matrix code. We now define its opera- 
tional semantics by defining execution of a code matrix. 

DEFINITION 5. A computation state is a pair $(l, v)$ where $l$ is a
control state (in the form of a label) and $v$ is a data state (in the
form of a tuple of values of variables).

Execution of a code matrix consists of the execution agent
performing a sequence of cycles. The agent carries a computation
state which is updated during a cycle. At the beginning of the cycle
the agent carries state $(l, v)$. It enters from the top of the matrix
through the column labeled by $l$ until it encounters a non-empty
cell. Let $r$ be the row in which this cell occurs and let $R$ be the
relation modeling the transition in this cell. If the data state $v$ of
the agent is such that there is a $(v, w) \in R$, then the agent exits
to the right with computation state $(r, w)$. This completes the cycle,
and the agent begins a new cycle unless it exited through row H.

The agent may start a cycle in a column that does not contain a 
transition having its data state as input. In that case the agent does 
not complete the cycle. 

Initially the agent carries the control state S. If and when the 
control state changes to H, execution halts with success. 
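
The execution cycle can be sketched as a small interpreter. The
following is our own illustration, not the abstract machine of the
paper: it encodes the transitions of Figure 3 as the cells of a code
matrix and lets an agent cycle through them. The names PrimesMatrix,
State, Transition, Cell, and execute are hypothetical. Two
simplifications are assumptions of the sketch: the agent takes the
first cell in the column that accepts the data state, and the two
transitions from C to B are merged into one cell.

```java
import java.util.*;

final class PrimesMatrix {
    // Data state: the tuple of variables of Figure 1.
    static final class State { int[] p; int N, j, k, n; }

    // A transition returns false when it is undefined for the state;
    // otherwise it updates the state in place and returns true.
    interface Transition { boolean run(State s); }

    // A non-empty matrix cell: its row (the next label) and its transition.
    static final class Cell {
        final char row; final Transition t;
        Cell(char row, Transition t) { this.row = row; this.t = t; }
    }

    // Columns of the code matrix, keyed by label; cells follow Figure 3.
    static final Map<Character, List<Cell>> COLUMNS = new HashMap<>();
    static {
        COLUMNS.put('S', Arrays.asList(
            new Cell('A', s -> { s.p[0] = 2; s.p[1] = 3; s.k = 2; return true; })));
        COLUMNS.put('A', Arrays.asList(
            new Cell('H', s -> s.k >= s.N),
            new Cell('B', s -> {
                if (!(s.k < s.N)) return false;
                s.j = s.p[s.k - 1] + 2; s.n = 0; return true; })));
        COLUMNS.put('B', Arrays.asList(
            new Cell('C', s -> s.p[s.n] * s.p[s.n] <= s.j),
            new Cell('A', s -> {
                if (!(s.p[s.n] * s.p[s.n] > s.j)) return false;
                s.p[s.k++] = s.j; return true; })));
        COLUMNS.put('C', Arrays.asList(
            new Cell('B', s -> {              // union of the two C-to-B transitions
                if (s.j % s.p[s.n + 1] != 0) s.n++;
                else { s.j += 2; s.n = 0; }
                return true; })));
    }

    // The agent repeats cycles until the control state is H, or fails
    // when no transition in the current column accepts the data state.
    static int[] execute(int N) {
        State s = new State();
        s.p = new int[N]; s.N = N;            // precondition S assumes N > 1
        char label = 'S';
        while (label != 'H') {
            char next = 0;
            for (Cell c : COLUMNS.get(label))
                if (c.t.run(s)) { next = c.row; break; }
            if (next == 0) throw new IllegalStateException("failed computation at " + label);
            label = next;
        }
        return s.p;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(execute(6)));
    }
}
```

The interpreter contains no loop structure of its own beyond the
cycle; the nesting of the two while loops of Figure 1 is recovered
entirely from the labels in the cells.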

DEFINITION 6. The binary relation associated with a code matrix
$M$ with set $E$ of computation states is the set of pairs
$((l, v), (l', v')) \in E \times E$ such that $(v, v') \in M_{l,l'}$,
the element in column $l$ and row $l'$ of $M$.

A computation of $M$ is a trace of the binary relation associated
with $M$.

A computation is closed if it is closed as a trace.
Figure 4. Example of the trace for N equals 3 of the code matrix in
Figure 12.



If a row is empty, then its label can only occur in the first state
of any computation. Such a label is the start label. If a column is
empty, then any computation state containing its label has to
terminate the computation. Such a label is the halt label. In
Figure 12, S is the start label and H is the halt label.

DEFINITION 7. A computation is successful if it is closed and if
its last computation state contains the halt label as control state; 
otherwise a closed computation is failed. 

DEFINITION 8. Let $M$ and $N$ be code matrices with the same set
$L$ of labels and the same set $A$ of data states. The product $MN$
of $M$ and $N$ is a code matrix with $L$ as set of labels and $A$ as
set of data states and with the cell $(MN)_{ik}$ in column $i$ and
row $k$ containing $\bigcup_{j \in L} M_{ij};N_{jk}$.

Let $I$ be the $L$-labeled matrix of binary relations over $A$ that
has the identity relation on $A$ on the main diagonal and the empty
relation elsewhere. Then we have $IM = MI = M$ with $M$ any
$L$-labeled matrix with binary relations over $A$ as elements. We
write $M^n$ for $M^{n-1}M$ for a positive integer $n$ while $M^0 = I$.

Matrix code can be viewed as a format for defining new binary 
relations in terms of the binary relations given by the statements 
and boolean expressions of the base language. 

DEFINITION 9. The relation computed by a code matrix M with 
start label S and halt label H is defined to be the set of $(s, t)$ in
$A \times A$ such that there exists a computation of $M$ that starts
with $(S, s)$ and ends with $(H, t)$.

We characterize the relation computed by a code matrix in terms 
of its powers. First two lemmas concerning these powers. 

LEMMA 1. If a code matrix $M$ has a computation containing a
segment $[(l, s), (l', s')]$, then there exists an $n$ such that
$(s, s') \in (M^n)_{l,l'}$.

Proof We proceed by induction on the segment length $k$. If $k = 1$
the computation has the form $(l, s), (l', s')$, so that $(l', s')$ is
the successor of $(l, s)$. By Definition 6 we have
$(s, s') \in M_{l,l'}$.

We assume the lemma true for $k$. That
$(l, s), (l_1, s_1), \ldots, (l_{k-1}, s_{k-1}), (l', s')$ is a
computation implies that there exists an $n$ such that
$(s, s_{k-1}) \in (M^n)_{l,l_{k-1}}$ (by the induction hypothesis) and
$(s_{k-1}, s') \in M_{l_{k-1},l'}$. By Definition 8 this implies that
$(s, s') \in (M^{n+1})_{l,l'}$.

LEMMA 2. $(s, s') \in (M^n)_{l,l'}$ implies that there exists a
segment $[(l, s), (l', s')]$ of a computation of $M$; this holds for
all $n = 1, 2, \ldots$

Proof We proceed by induction on $n$. $(s, s') \in M_{l,l'}$ implies
that $[(l, s), (l', s')]$ is a segment. This takes care of the base
case $n = 1$.

Assume the lemma for $n$.
$(s, s'') \in (M^{n+1})_{l,l''}$ implies that there exists an $s'$ and
an $l'$ such that $(s, s') \in (M^n)_{l,l'}$ and
$(s', s'') \in M_{l',l''}$ by Definition 8. Hence, by the induction
assumption there exists a segment $[(l, s), (l', s')]$ and
$(s', s'') \in M_{l',l''}$, which implies, by Definition 6, that there
exists a segment $[(l, s), (l'', s'')]$ of a computation of $M$.

THEOREM 1. Suppose that $M$ is a code matrix with start label S
and halt label H, with a finite set of labels, and a finite set of data
states. Then the relation computed by $M$ is
$\bigcup_{n=1}^{\infty} (M^n)_{S,H}$.

Proof Suppose that the pair $(s, t)$ of data states is in the
relation computed by $M$. By Definition 9 there exists a
computation of $M$ that begins with $(S, s)$ and ends with $(H, t)$.
According to Lemma 1 there is an $n$ such that
$(s, t) \in (M^n)_{S,H}$. Hence
$(s, t) \in \bigcup_{n=1}^{\infty} (M^n)_{S,H}$.

Suppose that $(s, t) \in \bigcup_{n=1}^{\infty} (M^n)_{S,H}$. By the
finiteness assumptions there exists an $n$ such that
$(s, t) \in (M^n)_{S,H}$. According to Lemma 2 this implies that there
exists a computation of $M$ that begins with $(S, s)$ and ends with
$(H, t)$. Therefore $(s, t)$ is in the relation computed by $M$,
according to Definition 9.
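
For small finite sets of labels and data states the matrix product
and the union of powers can be computed directly. The sketch below is
an illustration under our own assumptions: labels and data states are
small integers, a matrix is a four-dimensional boolean array, and the
countdown matrix is a hypothetical example that is not taken from the
paper. The names MatrixPowers, product, countdown, and
computedRelation are ours. The union is cut off at the number of
computation states, which suffices because a computation from
$(S, s)$ to $(H, t)$ that revisits a computation state can be
shortened.

```java
final class MatrixPowers {
    static final int L = 3, D = 4;          // 3 labels, data states 0..3
    static final int S = 0, A = 1, H = 2;   // start, intermediate, halt label

    // m[i][j][x][y] is true iff (x, y) is in the transition of column i, row j.
    static boolean[][][][] empty() { return new boolean[L][L][D][D]; }

    // Definition 8: (MN)[i][k] is the union over j of M[i][j];N[j][k].
    static boolean[][][][] product(boolean[][][][] m, boolean[][][][] n) {
        boolean[][][][] out = empty();
        for (int i = 0; i < L; i++) for (int j = 0; j < L; j++) for (int k = 0; k < L; k++)
            for (int x = 0; x < D; x++) for (int t = 0; t < D; t++) for (int y = 0; y < D; y++)
                if (m[i][j][x][t] && n[j][k][t][y]) out[i][k][x][y] = true;
        return out;
    }

    // Hypothetical countdown matrix: S -> A does nothing, A -> A is
    // "v > 0; v--", and A -> H is the guard "v == 0".
    static boolean[][][][] countdown() {
        boolean[][][][] m = empty();
        for (int v = 0; v < D; v++) m[S][A][v][v] = true;
        for (int v = 1; v < D; v++) m[A][A][v][v - 1] = true;
        m[A][H][0][0] = true;
        return m;
    }

    // Union of (M^n)[S][H] for n = 1 .. L*D.
    static boolean[][] computedRelation(boolean[][][][] m) {
        boolean[][] rel = new boolean[D][D];
        boolean[][][][] power = m;                  // power holds M^n
        for (int n = 1; n <= L * D; n++) {
            for (int x = 0; x < D; x++) for (int y = 0; y < D; y++)
                if (power[S][H][x][y]) rel[x][y] = true;
            power = product(power, m);              // M^(n+1) = M^n M
        }
        return rel;
    }

    public static void main(String[] args) {
        boolean[][] rel = computedRelation(countdown());
        for (int x = 0; x < D; x++) for (int y = 0; y < D; y++)
            if (rel[x][y]) System.out.println("(" + x + ", " + y + ")");
    }
}
```

For the countdown matrix the computed relation consists of the pairs
$(v, 0)$ for every data state $v$: each computation that reaches H
ends with the counter at zero.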

6. Verification of matrix code 

If the matrix in matrix code is in a certain relation with its row 
vector of preconditions and column vector of postconditions, then 
its computations are partially correct. In this section Theorem 2
makes this claim precise. 

Conditions Transitions can be regarded as transformations of a
single input to a single output. A transition can also be regarded as
a condition transformer: transition $T$ transforms condition $p$ into
the condition {p}T.

We characterize transitions by conditions of the form
$\{p\}T \subseteq q$. According to Definition 2 this is written as
{p}T{q}. This notation was introduced with $T$ as a binary relation in
general. When $T$ is the relation computed by a code matrix, this
implies that condition $p$ does not imply termination. Hence the
correctness expressed by {p}T{q} is partial correctness.

In case the code matrix is nondeterministic there may be data 
states in p that begin computations that end in different data states; 
{p}T{q} implies that these final data states are all in q. 

Condition vectors The transitions that are the elements of a code
matrix define transformations on individual conditions. The matrix
as a whole defines a transformation on condition vectors: vectors of
conditions indexed by labels. The computations of a matrix have as
elements computation states, which have the form $(l, v)$, where $l$
is a label and $v$ is a data state. A set $P$ of computation states
defines a condition vector $C$ by
$C_l = \{v \in A : (l, v) \in P\}$ for all $l \in L$.
Conversely, $C$ can be used to define
$P = \bigcup_{l \in L} \{(l, v) : v \in C_l\}$.
As the two correspondences are each other's inverses, condition
vectors are isomorphic to sets of computation states.

DEFINITION 10. The expression {P}M{Q} asserts that
$(\{P\}M) \subseteq Q$,

where {P}M is the condition vector of which the $i$-th element is
the right projection of $\bigcup_{j \in L} I(P_j); M_{ji}$, for all
$i \in L$. Here $I(C)$ is defined as the following subset of the
identity relation on data states:
$\{(s, s) \in A \times A : C \text{ is true in } s\}$.

THEOREM 2. Let $M$ be a code matrix and $V$ a condition vector
satisfying {V}M{V}. For any computation state $(l', s')$ of any
computation beginning with $(l, s)$ such that $s \in V_l$ it is the
case that $s' \in V_{l'}$.

Proof 

We proceed by induction on the length $n$ of the computation. If
$n = 1$ (one state in the computation) we have $(l', s') = (l, s)$.
Assume the theorem true for computations of length $n$. Consider
the computation

$(l, s), (l_1, s_1), \ldots, (l_{n-1}, s_{n-1}), (l', s')$.

By the induction assumption $s_{n-1} \in V_{l_{n-1}}$. By Definition 6
$(s_{n-1}, s') \in M_{l_{n-1},l'}$. It is given that {V}M{V}, hence in
particular that $\{V_{l_{n-1}}\} M_{l_{n-1},l'} \{V_{l'}\}$. It follows
that $s' \in V_{l'}$, which
establishes the theorem for the computation of length n + 1.
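
On a finite example the hypothesis and the conclusion of Theorem 2
can both be tested by enumeration. The following sketch rests on our
own assumptions: it uses a hypothetical countdown matrix with three
labels and four data states, a condition vector chosen by hand, and
the illustrative names InvariantCheck, spec, isInvariant, and
staysInside.

```java
final class InvariantCheck {
    static final int L = 3, D = 4;          // 3 labels, data states 0..3
    static final int S = 0, A = 1, H = 2;   // start, intermediate, halt label

    // m[i][j][x][y] is true iff (x, y) is in the transition of column i, row j.
    static boolean[][][][] countdown() {
        boolean[][][][] m = new boolean[L][L][D][D];
        for (int v = 0; v < D; v++) m[S][A][v][v] = true;       // S -> A: do nothing
        for (int v = 1; v < D; v++) m[A][A][v][v - 1] = true;   // A -> A: v > 0; v--
        m[A][H][0][0] = true;                                   // A -> H: v == 0
        return m;
    }

    // Candidate condition vector: any state at S and A, only v == 0 at H.
    static boolean[][] spec() {
        boolean[][] v = new boolean[L][D];
        for (int x = 0; x < D; x++) { v[S][x] = true; v[A][x] = true; }
        v[H][0] = true;
        return v;
    }

    // {V}M{V}: every transition maps states satisfying the condition of
    // its column to states satisfying the condition of its row.
    static boolean isInvariant(boolean[][] v, boolean[][][][] m) {
        for (int i = 0; i < L; i++) for (int j = 0; j < L; j++)
            for (int x = 0; x < D; x++) for (int y = 0; y < D; y++)
                if (v[i][x] && m[i][j][x][y] && !v[j][y]) return false;
        return true;
    }

    // Follows all computations from (label, state) for at most fuel
    // steps and reports whether every visited state satisfies v.
    static boolean staysInside(boolean[][] v, boolean[][][][] m,
                               int label, int state, int fuel) {
        if (!v[label][state]) return false;
        if (fuel == 0) return true;
        for (int j = 0; j < L; j++) for (int y = 0; y < D; y++)
            if (m[label][j][state][y] && !staysInside(v, m, j, y, fuel - 1)) return false;
        return true;
    }

    public static void main(String[] args) {
        boolean[][] v = spec();
        boolean[][][][] m = countdown();
        System.out.println("{V}M{V} holds: " + isInvariant(v, m));
        for (int s = 0; s < D; s++)
            System.out.println("from (S, " + s + "): " + staysInside(v, m, S, s, L * D));
    }
}
```

The check isInvariant inspects each cell once, whereas staysInside
explores computations; the theorem states that the first check makes
the second one unnecessary.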



7. Mathematical semantics 

A condition vector $F$ such that {F}M = F is a fixpoint of $M$ when
$M$ is regarded as a transformer of condition vectors. Typically $F$
is such that it has a compact description by means of boolean
expressions. This is so because it derives from a program
specification. But there is no reason to believe that this holds for
{F}M. What makes Floyd's method useful is that it does not require
finding a fixpoint of $M$, but only requires a solution $V$ of
$\{V\}M \subseteq V$ such that $V_S$ is the condition at the start
node and $V_H$ the one at the halt node.

The fact that M, a monotonic transformation, is guaranteed, by 
the Knaster/Tarski theorem, to have a fixpoint is of no practical 
significance for two reasons. In the first place, we are not interested 
in verifying a given code matrix: the reason for using matrix code 
is that it helps us discover a program satisfying the specification. In 
the second place, the iterative algorithm that proves the existence of 
a fixpoint is not practically executable. The resulting fixpoint (the 
least such) is unlikely to have an intelligible description. 

Condition vectors are an example of a semimodule, a general- 
ization of the familiar vector space. In a vector space the scalars are 
elements of a field (e.g. the reals). If the scalars are generalized to 
elements of a ring (e.g. the integers), the vector space becomes a 
module. The analog of addition of integers is union of binary rela- 
tions. Union does not have an inverse, so that we find that binary 
relations over a given domain are a semiring. The corresponding 
generalization of a vector is a semimodule. Thus the mathemati- 
cal model of imperative programming obtained by Matrix Code is 
given by the theory of semimodules. 

Semimodules are important in pure mathematics. For their role
in programming we refer to Parker's monograph [12]. Parker
defines a general framework, partial-order programming, which
captures numerical optimization problems as well as functional and
logic programming. As Parker shows, partial-order programming 
can take as special form semilinear partial-order programming, 
where the partially-ordered spaces take the form of semimodules. 
Examples of problems that find a natural formulation as transfor- 
mations in semimodules expressible by matrices are: path reliabil- 
ity, path connectivity, maximum capacity paths, k-shortest paths, 
regular expressions, word abbreviations, path and cutset enumera- 
tion, and certain scheduling problems. In this paper we show that 
Matrix Code is semilinear partial-order programming, thereby in- 
heriting a rich theory and sharing algebraic properties with many 
important applications. 

8. Systematic program development 

Floyd's method is difficult to apply because it is difficult to find
the required conditions even when the program is correct. Because
of this Dijkstra [4, 5] advocated parallel development of code and
proof. In this section we demonstrate parallel development of a
code matrix for the sample problem solved in Figure 1: to fill an
array with the first N prime numbers in increasing order.

Background on prime numbers Before we start, let us review 
what we need to know about prime numbers. The following list of 
facts is not intended as a complete or nonredundant set of axioms; 
they are a selection to guide us in the choice of conditions and 
transitions. 

1. A prime is a positive integer that has no divisors. (We do not 
count 1 or the integer itself as divisors. Moreover, 1 is not a 
prime.) 

2. There are infinitely many primes, so the problem can be solved
for any N.

3. 2 and 3 are the first two primes. So a way to get started is to
accept these as given and place them in the beginning of the
table. This has the advantage that we always have the situation
where the last prime in the table is odd and the next odd number
is the first candidate to be tested for the next prime.

4. If a number has a divisor, then it has a prime divisor. This can
be used to save effort: we only have to test for divisibility by
smaller primes, and these are already in the table.

5. If a number has a divisor, then it has a prime divisor less than or
equal to its square root. This implies that we do not have to test
the candidate for the next prime for divisibility by all primes
already in the table.

6. The square of every prime is greater than the next prime. The 
significance of this fact will become apparent as we proceed. 
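
Facts 5 and 6 can be spot-checked by brute force for small numbers.
The sketch below is only such a check under our own assumptions, not a
proof; the class name PrimeFacts, the method names, and the bound 2000
are illustrative. The primality test deliberately tries every smaller
divisor so that it does not itself rely on fact 5.

```java
final class PrimeFacts {
    // Primality by exhaustive trial division, independent of fact 5.
    static boolean isPrime(int x) {
        if (x < 2) return false;
        for (int d = 2; d < x; d++) if (x % d == 0) return false;
        return true;
    }

    // Fact 5: every composite x below the bound has a prime divisor d
    // with d * d <= x.
    static boolean fact5HoldsBelow(int bound) {
        for (int x = 4; x < bound; x++) {
            if (isPrime(x)) continue;
            boolean found = false;
            for (int d = 2; d * d <= x && !found; d++)
                found = isPrime(d) && x % d == 0;
            if (!found) return false;
        }
        return true;
    }

    // Fact 6: for consecutive primes prev < q below the bound,
    // prev * prev > q.
    static boolean fact6HoldsBelow(int bound) {
        int prev = 2;
        for (int q = 3; q < bound; q++) {
            if (!isPrime(q)) continue;
            if ((long) prev * prev <= q) return false;
            prev = q;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("fact 5 below 2000: " + fact5HoldsBelow(2000));
        System.out.println("fact 6 below 2000: " + fact6HoldsBelow(2000));
    }
}
```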

Deriving the code matrix The distinctive advantage of matrix 
code is that a matrix can be expanded from the specification in 
small steps using only the logic of the application without needing 
to attend to the control component of the algorithm. Thus matrix 
code is an example of Kowalski's principle "Algorithm = Logic + 
Control" (H. 

We assume that the specification exists in the form of a
precondition and a postcondition. This gives rise to a code matrix
with one row and one column; the one in Figure 5.



Column S: p[0..N-1] exists & N>1
Row H: p[0..N-1] contains the first N primes
Cell (column S, row H): /* which T? */



Figure 5. There is only an empty transition T such that {S}T{H}. 

The one element of this matrix is the transition T such that 
{S}T{H} is true. That is, T has to be a simple combination of 
guards and assignment statements that places the N first primes in p, 
whatever N is. Absent such a T, we leave the matrix cell empty. The 
resulting code matrix satisfies {S}T{H}, which makes it partially 
correct, but very partially so: it has no successful computations. 
Although Figure 5 is the correct start of the development process,
it is not the last step. 

As it is too ambitious to place all primes in the array with a 
single transition, a reasonable thing to try is to fill it with the first k 
primes and then try to add the next prime after p[k-1].

We need a condition A that is intermediate in the sense that 
{S}T1{A} and {A}T2{H} for simple T1 and T2. Such a condition
is: the first k primes in increasing order are in p[0..k-1] with
1 < k <= N.

Condition A is promising because it is easy to think of such a T1
and such a T2. The result is in Figure 6.

This again is a partially correct code matrix. It is a slight im- 
provement in that it solves the problem if N happens to be one or 
two. In all other cases it leads to failed computations. The difficulty 
is that in column A we may have that k < N, so that we cannot make 
the transition to H. We need to find the next prime after p[k-1].
Let j be the current candidate for this next prime. That suggests for
condition B: A is true and j is such that there is no prime greater
than p[k-1] and less than j. Moreover, j is not divisible by any of
p[0..n]. This condition is abbreviated to relB(p,k,n,j). It is a
useful condition, as there is a simple transition that makes this true.

In the new column B it is easy to detect whether n is large
enough to conclude that j is the next prime after p[k-1]. We place
the corresponding transition in column B and we have Figure 11.
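
The abbreviation relB(p,k,n,j) can be read as an executable predicate. The Java sketch below is one possible reading, under the assumption that the bound n < k mentioned in the caption of Figure 11 is part of the condition; the class RelB and the helper isPrime are hypothetical names, and the predicate is meant for checking small cases, not for use in the program itself.

```java
// One possible executable reading of relB; RelB and isPrime are hypothetical names.
class RelB {
    // Reference primality test by plain trial division.
    static boolean isPrime(int m) {
        if (m < 2) return false;
        for (int d = 2; (long) d * d <= m; d++)
            if (m % d == 0) return false;
        return true;
    }

    // True when (1) 0 <= n < k (assumption taken from the caption of Figure 11),
    // (2) there is no prime greater than p[k-1] and less than j, and
    // (3) j is not divisible by any of p[0..n].
    static boolean relB(int[] p, int k, int n, int j) {
        if (n < 0 || n >= k) return false;
        for (int m = p[k - 1] + 1; m < j; m++)
            if (isPrime(m)) return false;
        for (int i = 0; i <= n; i++)
            if (j % p[i] == 0) return false;
        return true;
    }

    public static void main(String[] args) {
        int[] p = {2, 3, 5, 7, 0, 0};            // the first k = 4 primes are in place
        System.out.println(relB(p, 4, 0, 9));    // 9 is odd, nothing prime between 7 and 9
        System.out.println(relB(p, 4, 1, 9));    // 9 is divisible by p[1] = 3
        System.out.println(relB(p, 4, 1, 11));   // 11 passes the tests against 2 and 3
    }
}
```

With p[n]*p[n] > j added to relB(p,k,n,j), facts 4 and 5 imply that j is the next prime, which is the test placed in column B.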



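 A       | S: p[0..N-1] exists & N>1 |
---------+---------------------------+---------------------------
 k >= N  |                           | H: p[0..N-1] contains the
         |                           |    first N primes
---------+---------------------------+---------------------------
         | p[0] = 2; p[1] = 3; k = 2 | A: p[0..k-1] contains the
         |                           |    first k primes & k <= N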

Figure 6. In column A the case k < N is missing. 

There are still failed computations. (In fact, there is still no 
way to get beyond N = 2.) The way ahead is clear: a transition 
is missing in column B, for the situation where n is too small to 
conclude that j is the next prime. That in itself produces condition 
C and, with it, a new row and column. 

In column C the missing information is whether j, the candidate 
for the next prime, is divisible by p[n+1]. If not, then n can be
incremented, and condition B is verified. If so, then j is not a prime
and the search for the next prime must be restarted with j+2. This
determines a transition in column C that verifies condition C, so is
placed in that row. See Figure 12.

Up till now we detected with every additional row and column 
that the new column lacked a transition. Not this time: none of the 
columns has a missing transition. The code matrix has no failed 
computations. So it gives the correct answer by exiting in row H, 
or it continues in an infinite computation. As we have only proved 
partial correctness, this latter alternative remains a possibility. 

Termination For an infinite computation to arise, there must be 
at least one condition that is revisited an infinite number of times. 
For each condition we give a reason why it can only be revisited a 
finite number of times. 

1. Condition A. For this condition to be returned to, k has to have
increased.

2. Condition B. For this condition to be returned to, n or j has to
have increased.

3. Condition C. For this condition to be returned to, n has to have
increased.

The transitions have been chosen so that the corresponding 
revisiting condition is satisfied. As none of these conditions can be 
satisfied an infinite number of times, the code matrix has no infinite 
computation. 

Running matrix code Running a code matrix in current practice 
requires translation to a currently available language. Our exam- 
ples of matrix code have been constructed for ease of translation 
to languages like Java or C. This entails a drastic reduction in
expressivity. Let us now demonstrate translation using Figure 12 as
an example.

As there is a similarity between the control states and the states 
of a finite-state automaton (FSA), a good starting point for system- 
atic translation of a code matrix is the pattern according to which an 
FSA is implemented. This is usually done by introducing a constant
for every state and letting a variable, say state, assume these
constants as values. An infinite loop containing a switch controlled
by state then contains a case statement for every control state.
The fact that in a programming language the case statements are
not restricted to input or output is the generalization that produces
a code matrix from an FSA.
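
For readers unfamiliar with the FSA pattern just described, here is a minimal Java sketch of it. The automaton chosen (strings with an even number of occurrences of 'a') and the names EvenAs and accepts are hypothetical and serve only to show the shape: one constant per state, a state variable, and an infinite loop around a switch.

```java
// Sketch of the usual FSA implementation pattern; the automaton is a hypothetical example.
class EvenAs {
    // Accepts exactly the strings containing an even number of 'a' characters.
    static boolean accepts(String input) {
        final int EVEN = 0, ODD = 1, ACCEPT = 2, REJECT = 3;
        int state = EVEN;   // control state
        int i = 0;          // position in the input
        while (true) {
            switch (state) {
            case EVEN:
                if (i == input.length()) state = ACCEPT;
                else state = (input.charAt(i++) == 'a') ? ODD : EVEN;
                break;
            case ODD:
                if (i == input.length()) state = REJECT;
                else state = (input.charAt(i++) == 'a') ? EVEN : ODD;
                break;
            case ACCEPT:
                return true;
            case REJECT:
                return false;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("abab"));
        System.out.println(accepts("bab"));
    }
}
```

In this pattern each case only consumes input and selects the next state. A code matrix keeps the same skeleton but lets each case test and modify arbitrary program variables.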

Each column of a code matrix translates to a case statement. 
The order in which the translations of the columns occur does not 
matter as long as state is initialized at S. Here we have arbitrarily 
chosen alphabetic order. In this way Figure 12 translates to the
following. 



public static void primesCM(int[] p, int N) {
    final int S = 0, A = 1, B = 2, C = 3, H = 4;
    int state = S;            // control state
    int j = 0, k = 0, n = 0;  // data state
    while (true) {
        switch (state) {
        case A:
            if (k >= N) state = H;
            else {j = p[k-1] + 2; n = 0; state = B;}
            break;
        case B:
            if (p[n]*p[n] > j) {
                p[k++] = j; state = A;
            } else state = C;
            break;
        case C:
            if (j % p[n+1] != 0) {n++; state = B;}
            else {j += 2; n = 0; state = C;}
            break;
        case H: return;
        case S: p[0] = 2; p[1] = 3; k = 2; state = A;
            break;
        }
    }
}



Figure 7. Translation of the code matrix in Figure 12 to Java.
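
To try the translation, the function of Figure 7 can be wrapped in a class and called on a small array. The wrapper below is a sketch: the class name PrimesDemo and the main method are hypothetical additions, and the caller is assumed to respect precondition S, that is, N > 1 and p has room for at least N elements.

```java
import java.util.Arrays;

// Figure 7 wrapped in a class so that it can be run; PrimesDemo and main are hypothetical.
class PrimesDemo {
    // Precondition S: p[0..N-1] exists and N > 1.
    static void primesCM(int[] p, int N) {
        final int S = 0, A = 1, B = 2, C = 3, H = 4;
        int state = S;            // control state
        int j = 0, k = 0, n = 0;  // data state
        while (true) {
            switch (state) {
            case A:
                if (k >= N) state = H;
                else { j = p[k - 1] + 2; n = 0; state = B; }
                break;
            case B:
                if (p[n] * p[n] > j) { p[k++] = j; state = A; }
                else state = C;
                break;
            case C:
                if (j % p[n + 1] != 0) { n++; state = B; }
                else { j += 2; n = 0; state = C; }
                break;
            case H:
                return;
            case S:
                p[0] = 2; p[1] = 3; k = 2; state = A;
                break;
            }
        }
    }

    public static void main(String[] args) {
        int[] p = new int[10];
        primesCM(p, p.length);
        System.out.println(Arrays.toString(p));
        // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    }
}
```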



A transition b0; S0 in column X and row R0 and a transition
!b0; S1 in column X and row R1 translate to case X: if (b0)
{S0; state = R0;} else {S1; state = R1;} break; in the
above code.
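
The translation rule can be exercised on a code matrix smaller than that of Figure 12. The following sketch assumes a hypothetical three-condition matrix for summing an array, which is not taken from this paper: S is the precondition, A states that s is the sum of a[0..i-1] with 0 <= i <= a.length, and H states that s is the sum of the whole array. Column A has the two complementary guards i < a.length and !(i < a.length), so the rule applies directly.

```java
// Hypothetical code matrix for array summation, translated by the rule above.
class SumCM {
    static int sum(int[] a) {
        final int S = 0, A = 1, H = 2;
        int state = S;     // control state
        int i = 0, s = 0;  // data state
        while (true) {
            switch (state) {
            case S:  // column S: a single unguarded transition to row A
                i = 0; s = 0; state = A;
                break;
            case A:  // b0 is i < a.length, S0 adds a[i], R0 is A; S1 is empty, R1 is H
                if (i < a.length) { s += a[i]; i++; state = A; }
                else { state = H; }
                break;
            case H:  // halt condition: s is the sum of a[0..a.length-1]
                return s;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{1, 2, 3}));  // prints 6
    }
}
```

Each verification condition can be checked in isolation here: for instance, A together with i < a.length implies that A holds again after s += a[i]; i++.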

9. Expressiveness of matrix code 

The Java code obtained by translating a code matrix is quite differ- 
ent from what one conventionally would write: compare Figure [JJ 
with Figure 7. In this example Matrix Code has the advantage of
being a verification and of being easy to discover. But in the prime- 
number problem Matrix Code does not lead to a more efficient pro- 
gram: it has the same set of computations as the conventional one. 

In this section we present an example where Matrix Code makes 
it easy to discover an algorithm that is more efficient than what 
is obtained via the conventional programming style. Consider the 
merging of two monotonically nondecreasing input streams into a 
single output stream. We have available the following C++ func- 
tions. 

bool getL(int& x);  // output parameter x
bool getR(int& x);  // output parameter x
void putL();
void putR();

where getL (getR) tests the left (right) input stream for emptiness. 
In case of nonemptiness the output parameter x gets the value of the 
first element of the stream. Neither getL nor getR changes any of
the streams. This is only done by the functions putL() and putR(),
which transfer the first element of a nonempty left or right input
stream to the output stream.

Figure 8 is a typical program for this situation. It acts
in two stages. In the first stage both input streams are nonempty. In 
the second stage one of the input streams is empty so that all that 
remains to be done is to copy the other stream to the output. 



void eMerge() {
    int u, v;
    while (getL(u) && getR(v))
        if (u <= v) putL();
        else putR();
    while (getL(u)) putL();
    while (getR(v)) putR();
}



Figure 8. A structured program for merging two streams. 



This algorithm performs unnecessary tests: in the first stage only
one of the input streams is changed, so that only that one needs to
be tested for emptiness; here both are tested (see footnote 3). It is
superfluous tests like this that allow the algorithm to be as simple
as it is.

Of course it is unlikely that it is important to save the kind of test 
just mentioned. But there are many types of merging situations and 
there may be some in which it does matter. An advantage of matrix 
code is that it does not bias the programmer towards including 
superfluous tests. 

We proceed to develop a code matrix for merging. The asser- 
tions need to indicate whether it is known that an input stream is 
empty and, if not, what its first element is. If an input stream is 
possibly empty then we represent it by "?". We write "e" if an input
stream is empty. Nonemptiness is indicated by writing "x:?",
where x is the first element. We have to do this for each of the
input streams; we write e.g. the assertion (u:?,v:?) to mean that
both input streams are nonempty and have first elements u and v,
respectively.

With these conventions we can state the program's specification 
as obtaining a transition from the state S, which is (?,?), to the
state H, which is (e,e), while maintaining the invariant that the
result of appending the output stream to the result of merging the
input streams is constant. Accordingly, the development starts with
Figure 9.



 S: (?,?)        |
-----------------+-----------
 /* which T? */  | H: (e,e)

Figure 9. Matrix code corresponding to the specification of the merging
program. But there is no T such that {S}T{H}. The conditions
in this figure, as well as those in Figures 13 and 14, include the
unstated conjunct that the result of appending the output stream to
the merge of the input streams is equal to the merge of the input
streams in the initial state.

As always with matrix code, we start with the conditions. Which 
do we need, in addition to the (?,?) and (e,e) given by the 
specification? For each of the input streams there are three states 
of information: 



• ?

• e

• x:? for some first element x

Footnote 3: With the one exception when the left input stream runs out
at the same time as, or before, the right input stream.

It is to be expected that the two input streams can assume each 
of the three information states independently, for a total of nine 
conditions. 
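
The nine conditions can be written down as data. The Java sketch below pairs each label with the information state of the left and right input streams, using the labels S, A to G, and H as they appear in Figures 13 and 14; the class MergeConditions and the enum and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// The nine merge conditions as data; class, enum, and method names are hypothetical.
class MergeConditions {
    // Information known about one input stream: "?", "e", or "x:?".
    enum Info { UNKNOWN, EMPTY, FIRST_KNOWN }

    // Labels and pairs follow Figures 13 and 14.
    enum Cond {
        S(Info.UNKNOWN, Info.UNKNOWN),          // (?,?)
        A(Info.FIRST_KNOWN, Info.UNKNOWN),      // (u:?,?)
        B(Info.EMPTY, Info.UNKNOWN),            // (e,?)
        C(Info.FIRST_KNOWN, Info.FIRST_KNOWN),  // (u:?,v:?)
        D(Info.FIRST_KNOWN, Info.EMPTY),        // (u:?,e)
        E(Info.UNKNOWN, Info.FIRST_KNOWN),      // (?,v:?)
        F(Info.UNKNOWN, Info.EMPTY),            // (?,e)
        G(Info.EMPTY, Info.FIRST_KNOWN),        // (e,v:?)
        H(Info.EMPTY, Info.EMPTY);              // (e,e)

        final Info left, right;

        Cond(Info left, Info right) {
            this.left = left;
            this.right = right;
        }
    }

    // Counts the distinct (left, right) pairs among the conditions.
    static int distinctPairs() {
        Set<String> seen = new HashSet<>();
        for (Cond c : Cond.values()) seen.add(c.left + "," + c.right);
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPairs());  // prints 9
    }
}
```

Since the nine pairs are distinct and there are only 3 x 3 combinations, every combination of information states has exactly one label.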

It is desirable that the initial condition (?,?) of minimal in- 
formation does not arise during a computation of the code matrix. 
Under the assumption that we can avoid this there will be only rows 
for the eight other conditions. By the time we will have populated 
the columns for these eight conditions we will see whether this as- 
sumption was justified. 

This problem is easy because the conditions are determined by 
the nature of the problem. For each condition there is an obvious 
and easy-to-realize revisiting condition. If there is at least one
unknown input stream, at least one of them has to become known
before revisiting. If both input streams are known, then at least one
of them has to have its first element transferred to output before
revisiting. See Figure 14, where the transitions have been chosen
to conform to the revisiting requirements. As each column either
has no guard or two complementary guards, no additional rows are
needed.

The translation of this table to C++ is given below. As the order 
of the translations of the columns is immaterial, we have placed 
them in alphabetic order by label. 



void mMerge() {
    int u, v;
    typedef enum {S, A, B, C, D, E, F, G, H} State;
    State state = S;  // control state
    while (true) {
        switch (state) {
        case A: state = getR(v) ? C : D; break;
        case B: if (getR(v)) {putR(); state = B;}
                else state = H; break;
        case C: if (u <= v) {putL(); state = E;}
                else {putR(); state = A;} break;
        case D: putL(); state = F; break;
        case E: state = getL(u) ? C : G; break;
        case F: state = getL(u) ? D : H; break;
        case G: putR(); state = B; break;
        case H: return;
        case S: state = getL(u) ? A : B; break;
        }
    }
}



Figure 10. A C++ function for merging two streams translated
from Figure 14.



The reason for developing a code matrix for the merge problem 
was the desire to avoid the superfluous tests of a function like the 
eMerge listed in Figure 8. To see how far mMerge improves in
this respect we have run both functions on the same set of pairs 
of input streams and counted the calls executed in both merge 
functions. 

Such comparisons are of course dependent on the nature of the 
input streams. For example, the more equal in length the input 
streams are, the more favourable for mMerge. Accordingly we have 
used a random-number generator to determine the lengths of the 
input streams. The input streams themselves are monotonically 
increasing with random increments. 





          getL   getR   putL   putR
eMerge    1756   2691    871   1819
mMerge     872   1821    871   1819
eMerge    1067    830    655    410
mMerge     656    411    655    410
eMerge    3261    735   2894    365
mMerge    2895    366   2894    365
eMerge    1355   1024    844    509
mMerge     845    510    844    509



Each pair of successive lines gives the result of running eMerge 
and mMerge on the same pair of input streams. The lengths of the 
streams are not listed separately, as they are equal to the number of 
calls to putL and putR shown in the table. 

A merge function needs to make at least one call to getL (getR) 
for every element of the left (right) input stream. It can be seen that 
mMerge remains close to this minimum, while eMerge does not. 

This example is notable in that matrix code yields an unfamiliar, 
test-optimal algorithm by default. Structured programming tends to 
reduce the number of control states. Matrix code lacks this bias: in 
its use it is natural to introduce control states as needed to serve as 
memory for test outcomes. 
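
The counting experiment can be reproduced in outline with the following Java sketch. It is not the program used to obtain the table above: the class MergeCount, the representation of streams as int arrays, and the use of the fields u and v in place of C++ reference parameters are assumptions made here so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the call-counting experiment; the stream representation is an assumption.
class MergeCount {
    final int[] left, right;                  // input streams as arrays
    int li = 0, ri = 0;                       // positions of the first elements
    final List<Integer> out = new ArrayList<>();
    int getLCalls, getRCalls, putLCalls, putRCalls;
    int u, v;                                 // stand-ins for the output parameters

    MergeCount(int[] left, int[] right) { this.left = left; this.right = right; }

    boolean getL() { getLCalls++; if (li == left.length) return false; u = left[li]; return true; }
    boolean getR() { getRCalls++; if (ri == right.length) return false; v = right[ri]; return true; }
    void putL() { putLCalls++; out.add(left[li++]); }
    void putR() { putRCalls++; out.add(right[ri++]); }

    // The structured program of Figure 8.
    void eMerge() {
        while (getL() && getR())
            if (u <= v) putL(); else putR();
        while (getL()) putL();
        while (getR()) putR();
    }

    // The translation of the code matrix, as in Figure 10.
    void mMerge() {
        final int S = 0, A = 1, B = 2, C = 3, D = 4, E = 5, F = 6, G = 7, H = 8;
        int state = S;
        while (true) {
            switch (state) {
            case A: state = getR() ? C : D; break;
            case B: if (getR()) { putR(); state = B; } else state = H; break;
            case C: if (u <= v) { putL(); state = E; } else { putR(); state = A; } break;
            case D: putL(); state = F; break;
            case E: state = getL() ? C : G; break;
            case F: state = getL() ? D : H; break;
            case G: putR(); state = B; break;
            case H: return;
            case S: state = getL() ? A : B; break;
            }
        }
    }

    public static void main(String[] args) {
        MergeCount e = new MergeCount(new int[]{1, 3}, new int[]{2});
        e.eMerge();
        MergeCount m = new MergeCount(new int[]{1, 3}, new int[]{2});
        m.mMerge();
        System.out.println("eMerge: getL=" + e.getLCalls + " getR=" + e.getRCalls + " out=" + e.out);
        System.out.println("mMerge: getL=" + m.getLCalls + " getR=" + m.getRCalls + " out=" + m.out);
    }
}
```

On the streams (1, 3) and (2), eMerge makes 5 calls to getL and 4 to getR, whereas mMerge makes 3 and 2, one more than the length of each stream, which matches the pattern in the table.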

10. Related work 

The following comment has been made on Matrix Code: "Although 
it reeks of flow charts, the proposal has some merit." The comment 
has some merit: flow charts are indeed closely related to Matrix 
Code. Flow charts were widely used as an informal programming 
notation from the early 1950s to 1970. Floyd [8] showed how as- 
sertions and verification conditions can prove a flow chart partially 
correct. Hoare [10] introduced the notation of triples for the
verification conditions and cast Floyd's method in the form of
inference rules for control structures such as while ... do ... and
if ... then ... else ...

Dijkstra observed that verifying assertions are difficult to find 
for existing code, so that an attempt at verification is a costly
undertaking with an uncertain outcome. He argued [4, 5] that code
and correctness should be "developed in parallel". The proposal
seems to have found no response, if only for the lack of specifics
in the proposal. Given the fact that Dijkstra's proposal was
considered unrealistically Utopian, and still is, it is interesting to
read what seems to be the first treatise [9] on programming in the
modern sense, published in 1946. Here programs are expressed in the
form of flow diagrams. At first sight one might think that these are 
flow charts under another name. This is not the case: flow diagrams 
consist of executable code integrated with assertions, with the un- 
derstanding that a consistent flow diagram proves the correctness 
of the computations performed by it. 

The imperative part of a flow diagram was translated to machine 
code (this was before the appearance of assemblers). I found no
indication in [9] that it was even contemplated to split off the
imperative part of the flow diagram. Thus we see that what was a vague
proposal [4, 5], and regarded as unrealistically Utopian in 1970, was
fully worked out in 1946 and may have become a practical reality 
in 1951 when the IAS machine became operational. 

By the time flow charts appeared, the proof part of flow di- 
agrams had been dropped. And apparently forgotten, for Floyd's 
discovery was published in 1967 and universally acknowledged as 
such. Floyd's format is rather different, and, in our opinion,
preferable to the flow diagrams of [9]. Matrix Code can be regarded as
a simplification of Floyd's flow chart annotated with assertions, a 
simplification made possible by the use of transitions that provide 
a common generalization of statements and tests. Apt and Schaerf
unify statements and tests in their nondeterministic control
structures [1].

Code matrices can be regarded as generalized Finite-State Au- 
tomata. The control states of code matrices are similar to the states 
in Finite-State Automata; the data states have no counterpart in 
FSAs. Data states can contain variables of widely varying types. 
These can include streams of characters, so that code matrices can 
simulate FSAs with input and/or output. This possibility makes 
Matrix Code reminiscent of Dana Scott's proposal [13] to put an
end to the proliferation of new variations of FSA by replacing them
by programs defined to run on suitably defined machines.

In spite of Scott's injunction, variants of FSA continued to 
appear. Of special interest in this context are labeled transition 
systems which are used to model and verify reactive systems 0]. 
Here the set of states is often not finite and there is typically no
halt state. Such systems are specified by rules of the form P --A--> Q
to indicate the possibility of a transition from state P to state Q
accompanied by action A. Mathematically the rules are viewed as 
a ternary relation containing triples consisting of P, A, and Q. This 
is of course unobjectionable, but the alternative view of the rules as 
constituting a matrix indexed by states, containing in this instance 
A as element indexed by P and Q has the advantage of connecting 
the theory to that of semilinear programming in the sense of Parker. 
Another variant of FSA are the augmented transition networks used 
in linguistics [15].

The property that a code matrix is both a set of logical formulas 
and an executable program is reminiscent of logic programming, 
especially its aspect of separating logic from control [11]. A special
form of logic program corresponding to imperative programs was
investigated in [3]. The modification of flow charts by means of
binary relations was introduced in [14].

11. Conclusions 

In this paper we write programs as matrices with binary relations 
as elements. These matrices can be regarded as transformations in a 
generalized vector space, where vectors have assertions about data 
states as elements. Computations of the programs are characterized 
by powers of the matrix and verified assertions show up as gener- 
alized eigenvectors of the matrix. Such results may be dismissed 
as frivolous theorizing. It seems to us that they are related to the 
following practical benefits. 

Our motivation was to address the fact that imperative pro- 
gramming is in an unsatisfactory state compared to functional and 
logic programming. In the latter paradigms, implementation is, or 
is close to, specification. In imperative programming the relation 
between implementation and specification is the verification prob- 
lem, a problem considered too hard for the practising programmer. 
We proposed Matrix Code as an imperative programming language
where the same construct can be read as a logical formula and can
serve as the basis for a routine translation to Java or C.

Another practical benefit is that it seems possible in some cases 
to develop algorithms incrementally by small, obvious steps from 
the specification. In this paper we go through such steps for an 
algorithm to fill a table with prime numbers using the method of 
trial division. Whether or not this success is an exceptional case, it 
seems certain that progress has been made in the direction of the 
old dream according to which the production of verified code is 
facilitated by developing proof and code in parallel. 

 B                | A                | S: p[0..N-1]     |
                  |                  | exists & N>1     |
------------------+------------------+------------------+------------------
                  | k >= N           |                  | H: p[0..N-1]
                  |                  |                  | contains the
                  |                  |                  | first N primes
------------------+------------------+------------------+------------------
 p[n]*p[n] > j;   |                  | p[0] = 2; p[1]   | A: p[0..k-1]
 p[k++] = j       |                  | = 3; k = 2       | contains the
                  |                  |                  | first k primes
                  |                  |                  | & k <= N
------------------+------------------+------------------+------------------
                  | k < N; j =       |                  | B: A & k<N &
                  | p[k-1]+2; n = 0  |                  | relB(p,k,n,j)

Figure 11. We have added a transition in column A for the case that
k < N. In that case we can start finding the next prime after p[k-1]
because we know that there is enough space in p to store it.
relB(p,k,n,j) means that there is no prime between the last prime
found and j, that n<k, and that j is not divided by any prime in
p[0..n].



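 C                | B                | A                | S: p[0..N-1]     |
                  |                  |                  | exists & N>1     |
------------------+------------------+------------------+------------------+------------------
                  |                  | k >= N           |                  | H: p[0..N-1]
                  |                  |                  |                  | contains the
                  |                  |                  |                  | first N primes
------------------+------------------+------------------+------------------+------------------
                  | p[n]*p[n] > j;   |                  | p[0] = 2; p[1]   | A: p[0..k-1]
                  | p[k++] = j       |                  | = 3; k = 2       | contains the
                  |                  |                  |                  | first k primes
                  |                  |                  |                  | & k <= N
------------------+------------------+------------------+------------------+------------------
 j%p[n+1] != 0;   |                  | k < N; j =       |                  | B: A & k<N &
 n++              |                  | p[k-1]+2; n = 0  |                  | relB(p,k,n,j)
------------------+------------------+------------------+------------------+------------------
 j%p[n+1] == 0;   | p[n]*p[n] <= j   |                  |                  | C: B &
 j += 2; n = 0    |                  |                  |                  | p[n]*p[n] <= j

Figure 12. This figure is both a general example of a code matrix and
the final stage of the development consisting of the sequence of
Figures 5, 6, and 11. Change from Figure 11: row and column with label
C are added. There are no incomplete columns. This, as well as each of
the previous versions, is partially correct, as implied by the validity
of the verification condition for each of the non-null matrix elements.
The absence of incomplete columns opens the possibility of total
correctness, but does not prove it.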



 A        | S: (?,?) |
----------+----------+--------------
          |          | H: (e,e)
          | getL(u)  | A: (u:?,?)
          | !getL(u) | B: (e,?)
 getR(v)  |          | C: (u:?,v:?)
 !getR(v) |          | D: (u:?,e)

Figure 13. See Figure 9. An input stream needs to be tested; the left
one is chosen arbitrarily. This gives rise to new conditions. Columns
for these will cause addition of yet more conditions. See Figure 14.
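
 G        | F        | E        | D        | C        | B        | A        | S: (?,?) |
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          | !getL(u) |          |          |          | !getR(v) |          |          | H: (e,e)
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          |          |          |          | u > v;   |          |          | getL(u)  | A: (u:?,?)
          |          |          |          | putR()   |          |          |          |
----------+----------+----------+----------+----------+----------+----------+----------+--------------
 putR()   |          |          |          |          | getR(v); |          | !getL(u) | B: (e,?)
          |          |          |          |          | putR()   |          |          |
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          |          | getL(u)  |          |          |          | getR(v)  |          | C: (u:?,v:?)
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          | getL(u)  |          |          |          |          | !getR(v) |          | D: (u:?,e)
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          |          |          |          | u <= v;  |          |          |          | E: (?,v:?)
          |          |          |          | putL()   |          |          |          |
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          |          |          | putL()   |          |          |          |          | F: (?,e)
----------+----------+----------+----------+----------+----------+----------+----------+--------------
          |          | !getL(u) |          |          |          |          |          | G: (e,v:?)

Figure 14. The complete code matrix for the merging problem, continuing
Figures 9 and 13.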



